Propbank-Br: a Brazilian Portuguese corpus annotated with semantic role labels
Abstract
Semantic Role Labeling is a Natural Language Processing task often carried out using annotated corpora. So far, no corpus of Portuguese annotated with semantic role labels has been available. This paper reports the annotation of a Brazilian Portuguese corpus following Propbank guidelines. This is the first step of a larger annotation effort and aims to pave the way for a distributed annotation task. Annotation decisions are discussed to highlight the language-specific aspects involved in the project.
Resumo. A Anotação de Papéis Semânticos é uma tarefa de Processamento de Línguas Naturais frequentemente realizada por meio de corpus anotado. Até o momento não há um corpus de português disponível que esteja anotado com rótulos de papéis semânticos. Este artigo relata a anotação de um corpus de português do Brasil seguindo as instruções do Propbank. Este é o primeiro passo de um esforço mais amplo de anotação e tem por objetivo abrir caminho para uma tarefa de anotação distribuída. As decisões de anotação são discutidas a fim de salientar os aspectos específicos de língua envolvidos no projeto.
Similar resources
Propbank-Br: a Brazilian Treebank annotated with semantic role labels
This paper reports the annotation of a Brazilian Portuguese Treebank with semantic role labels following Propbank guidelines. A different language and a different parser output impact the task and require some decisions on how to annotate the corpus. Therefore, a new annotation guide, called Propbank-Br, has been generated to deal with specific language phenomena and parser problems. In this ph...
Automatic Generation of a Lexical Resource to support Semantic Role Labeling in Portuguese
This paper reports an approach to automatically generate a lexical resource to support incremental semantic role labeling annotation in Portuguese. The data come from the corpus Propbank-Br (Propbank of Brazilian Portuguese) and from the lexical resource of English Propbank, as both share the same structure. In order to enable the strategy, we added extra annotation to Propbank-Br. This approac...
Labeling Chinese Predicates with Semantic Roles
In this article we report work on Chinese semantic role labeling, taking advantage of two recently completed corpora, the Chinese PropBank, a semantically annotated corpus of Chinese verbs, and the Chinese Nombank, a companion corpus that annotates the predicate–argument structure of nominalized predicates. Because the semantic role labels are assigned to the constituents in a parse tree, we fi...
A PropBank for Portuguese: the CINTIL-PropBank
With the CINTIL-International Corpus of Portuguese, an ongoing corpus annotated with fully fledged grammatical representation, sentences get not only a high level of lexical, morphological and syntactic annotation but also a semantic analysis that prepares the data for a manual specification step and thus opens the way for a number of tools and resources for which there is a great research focus...
A Tool for Korean Semantic Annotated Corpus Construction
Although a semantically annotated corpus is necessary for semantic role labeling, no such corpus has been constructed for Korean. This paper establishes a tool for the construction of a Korean semantically annotated corpus, including the Korean Proposition Bank (PropBank). The Sejong predicate case frame dictionary was used as one of the linguistic resources, and a Korean syntactic annotate...